Newly Released Capabilities in the Distributed-Memory SuperLU Sparse Direct Solver

Abstract

We present the new features available in the recent release of SuperLU_DIST, Version 8.1.1. SuperLU_DIST is a distributed-memory parallel sparse direct solver. The new features include (1) a 3D communication-avoiding algorithm framework that trades off inter-process communication for selective memory duplication, (2) multi-GPU support for both NVIDIA GPUs and AMD GPUs, and (3) mixed-precision routines that perform single-precision LU factorization and double-precision iterative refinement. Apart from the algorithm improvements, we also modernized the software build system to use CMake and Spack package installation tools to simplify the installation procedure. Throughout the article, we describe in detail the pertinent performance-sensitive parameters associated with each new algorithmic feature, show how they are exposed to the users, and give general guidance on how to set these parameters. We illustrate that the solver's performance in both time and memory can be greatly improved after systematic tuning of the parameters, depending on the input matrix and the underlying hardware.
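
The mixed-precision idea named in item (3) can be illustrated with a minimal sketch. It is not the SuperLU_DIST API: it is a small dense, pure-Python toy in which single precision is emulated by rounding through the `struct` module, and the function names (`f32`, `lu_factor_f32`, `lu_solve_f32`, `refine_solve`) are hypothetical. It also uses partial pivoting for simplicity, which is an assumption of this sketch rather than a description of the solver's pivoting strategy. The factorization and triangular solves run in (emulated) single precision, while the residual and the solution update are kept in double precision.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """Dense LU with partial pivoting; every stored result is rounded to single precision."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular to single precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])  # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the single-precision factors; the right-hand side is rounded too."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # forward substitution with unit lower-triangular L
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def refine_solve(a, b, tol=1e-14, max_iter=20):
    """Single-precision LU plus double-precision iterative refinement."""
    n = len(a)
    lu, perm = lu_factor_f32(a)
    x = lu_solve_f32(lu, perm, b)
    for _ in range(max_iter):
        # Residual r = b - A x, computed in double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(lu, perm, r)  # cheap correction from the low-precision factors
        x = [x[i] + d[i] for i in range(n)]  # update accumulated in double precision
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0], [1.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    b = [8.0, 0.0, 14.0]  # chosen so that the exact solution is [1, -2, 3]
    print(refine_solve(a, b))
```

The design point is that the expensive step (the factorization) is done once in the cheaper precision, and each refinement step reuses those factors, so only the residual computation needs double precision. For a well-conditioned matrix such as the one in the example, a few refinement steps are typically enough to reach double-precision accuracy.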

Related articles

A Distributed CPU-GPU Sparse Direct Solver

This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting. While BLAS calls can account for more than 40% of the overall factorization time, the difficulty is that small problem sizes dominate the workload, making efficient GPU utilization challenging. This ...

MUMPS: A General Purpose Distributed Memory Sparse Solver

MUMPS is a software package for the multifrontal solution of large sparse linear systems on distributed memory computers. The matrices can be symmetric positive definite, general symmetric, or unsymmetric, and possibly rank deficient. MUMPS exploits parallelism coming from the sparsity in the matrix and parallelism available for dense matrices. Additionally, large computational tasks are divide...

A distributed-memory hierarchical solver for general sparse linear systems

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direc...

Extended Sparse Distributed Memory

Sparse distributed memory is an auto-associative memory system that stores high dimensional Boolean vectors. Here we present an extension of the original SDM that uses word vectors of larger size than address vectors. This extension preserves many of the desirable properties of the original SDM: autoassociability, content addressability, distributed storage, robustness over noisy inputs. In add...

Sparse Distributed Memory

Sparse Distributed Memory was proposed by Pentti Kanerva as a model of human long term memory. He presented it as an architecture that could store large patterns and retrieve them based on partial matches with current sensory inputs. The architecture can be realized as a neural net or as an associative memory. SDM exhibits behaviors, both in theory and in experiment, that resemble those previou...

Journal

Journal title: ACM Transactions on Mathematical Software

Year: 2023

ISSN: 0098-3500, 1557-7295

DOI: https://doi.org/10.1145/3577197